Cosine Siamese Models for Stance Detection

Authors

  • Akshay Agrawal
  • Delenn Chin
  • Kevin Chen
Abstract

Fake news detection has received much attention since the 2016 United States Presidential election, whose outcome is thought to have been influenced by the unregulated abundance of fake news articles. The recently released Fake News Challenge (FNC) aims to address the fake news problem by decomposing it into distinct NLP tasks, the first of which is stance detection. We employ a neural architecture consisting of two homologous subnetworks for headline and body processing and a subsequent node for headline/body comparison and stance prediction. Headlines and bodies are represented with a weighted bag-of-words combination of word vectors passed through a ReLU, where the weights are learned. Stance is quantified by computing the cosine similarity of these weighted bag-of-words representations, and the score is regressed to a relaxed, continuous label space in which the true discrete labels are posited to lie. Our model, which outperforms other recurrent methods, achieves an FNC score of 0.891 out of 1.00, a 0.10 increase over the published 0.79 FNC baseline. The cosine similarity function induces a natural geometry among the learned headline and body representations, with unrelated inputs generally orthogonal to each other and agreeing inputs nearly collinear. Our findings point to the importance of the optimization objective, rather than the architecture of the subnetwork models, for success in stance detection, echoing recent work demonstrating the competitiveness of weighted bag-of-words models for textual similarity tasks.
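The pipeline described in the abstract (weighted bag-of-words encoding, ReLU, cosine comparison, regression to a continuous target) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: the weights are taken to be one scalar per vocabulary token, both subnetworks share the same parameters, the loss is a squared error, and the target value 1.0 for an agreeing pair is hypothetical. The names `encode`, `cosine_similarity`, and `squared_error` are likewise illustrative, and the random parameters stand in for values that would be learned.

```python
import numpy as np


def encode(token_ids, embeddings, token_weights):
    """Weighted bag-of-words encoding followed by a ReLU.

    Assumption: one scalar weight per vocabulary token. The abstract only
    says that the weights are learned, not what shape they take.
    """
    vectors = embeddings[token_ids]        # shape (n_tokens, dim)
    weights = token_weights[token_ids]     # shape (n_tokens,)
    pooled = (weights[:, None] * vectors).sum(axis=0)
    return np.maximum(pooled, 0.0)         # ReLU


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between u and v; eps guards against zero norms."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def squared_error(score, target):
    """Regression loss against a continuous target (assumed squared error)."""
    return (score - target) ** 2


# Toy parameters; in a trained model these would be learned.
rng = np.random.default_rng(0)
vocab_size, dim = 50, 8
embeddings = rng.normal(size=(vocab_size, dim))
token_weights = rng.uniform(0.5, 1.5, size=vocab_size)

# Hypothetical token ids for one headline/body pair.
headline_ids = np.array([3, 17, 42])
body_ids = np.array([3, 17, 42, 8, 25, 11])

# The same encoder is applied to both inputs (shared parameters, assumed).
h = encode(headline_ids, embeddings, token_weights)
b = encode(body_ids, embeddings, token_weights)

score = cosine_similarity(h, b)
loss = squared_error(score, target=1.0)    # 1.0 = hypothetical "agree" target
```

Because the ReLU makes every coordinate of both representations non-negative, the cosine score in this sketch always falls between 0 and 1, which matches the geometry the abstract describes: a score near 0 corresponds to nearly orthogonal (unrelated) inputs and a score near 1 to nearly collinear (agreeing) inputs.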

Similar resources

Weakly Supervised One-Shot Detection with Attention Siamese Networks

We consider the task of weakly supervised one-shot detection. In this task, we attempt to perform a detection task over a set of unseen classes, when training only using weak binary labels that indicate the existence of a class instance in a given example. The model is conditioned on a single exemplar of an unseen class and a target example that may or may not contain an instance of the same cl...

Preliminary investigation of Boltzmann machine classifiers for speaker recognition

We propose a novel generative approach to speaker recognition using Boltzmann machines, a fledgeling non-Gaussian probabilistic framework that is increasingly gaining attention in several machine learning fields. We show how a modified i-vector representation of speech utterances enables the development of several Boltzmann machine architectures for speaker verification and we report some preli...

A Relevance Score Estimation for Spoken Term Detection Based on RNN-Generated Pronunciation Embeddings

In this paper, we present a novel method for term score estimation. The method is primarily designed for scoring the out-of-vocabulary terms, however it could also estimate scores for in-vocabulary results. The term score is computed as a cosine distance of two pronunciation embeddings. The first one is generated from the grapheme representation of the searched term, while the second one is com...

Class-balanced siamese neural networks

This paper focuses on metric learning with Siamese Neural Networks (SNN). Without any prior, SNNs learn to compute a non-linear metric using only similarity and dissimilarity relationships between input data. Our SNN model proposes three contributions: a tuple-based architecture, an objective function with a norm regularisation and a polar sine-based angular reformulation for cosine dissimilari...

Content-based Representations of audio using Siamese neural networks

In this paper, we focus on the problem of content-based retrieval for audio, which aims to retrieve all semantically similar audio recordings for a given audio clip query. This problem is similar to the problem of query by example of audio, which aims to retrieve media samples from a database, which are similar to the user-provided example. We propose a novel approach which encodes the audio in...

Publication date: 2017